Abstract

In analyses of algorithms, a substantial amount of effort often has to be spent on the discussion
of special cases. For example, when the analysis considers the cases X < Y and X > Y separately,
one might have to be especially careful about what happens when X = Y. On the other hand,
experience tells us that when a previously overlooked special case of this kind is discovered, one
nearly always finds a way to handle it. This is typically done by modifying the analysis and/or the
algorithm very slightly.

In this article we substantiate this observation theoretically. We concentrate on deterministic 
algorithms for weighted combinatorial optimization problems. A problem instance of this kind 
is defined by its structure and a vector of weights. The concept of a null case is introduced as a
set of problem instances whose weight vectors constitute a nowhere open set (or null set) in the
space of all possible weight configurations. An algorithm is called robust if any null case can be
disregarded in the analysis of both its solution quality and resource requirements.

We show that achieving robustness is only a matter of breaking ties the right way. More 
specifically, we show that the concept of symbolic perturbation known from the area of geometric 
algorithms guarantees that no surprises will happen in null cases. We argue that for a huge class 
of combinatorial optimization algorithms it is easy to verify that they implicitly use symbolic 
perturbation for breaking ties and thus can be analyzed under the assumption that some arbitrary 
null case never occurs. Finally, we prove that there exists a symbolic perturbation tie breaking 
policy for any algorithm. 
1 Introduction



Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function and let $C \subseteq \mathbb{R}^n$ be a null set. It is a well-known fact that
$\sup\{f(x) \mid x \in \mathbb{R}^n \setminus C\} = \sup\{f(x) \mid x \in \mathbb{R}^n\}$.

Imagine that $A(x)$, the strictly positive value of the solution computed by an algorithm $A$ that takes
vectors of $\mathbb{R}^n$ as the input, is continuous. Assume that the value of the optimal solution $\mathrm{OPT}(x)$ is
continuous as well. Let there be a proof that $A$ is a $c$-approximation which holds for almost every
input, i.e. $\mathrm{OPT}(x)/A(x) \le c$ for each $x \in \mathbb{R}^n \setminus C$, where $C$ is a null set. Then, as $\mathrm{OPT}/A$ is continuous
as well, we know that $A$ is a $c$-approximation also for inputs from $C$.

$C$ is what we refer to with the term null case. If in the above scenario it is known a priori that
$A$ and OPT are continuous, one can conveniently disregard some arbitrary null case in the proof for
$\mathrm{OPT}(x)/A(x) \le c$. For example, one could assume that $x_i \neq 0$ and $x_i \neq x_j$ for each $i \neq j$. We say that
$A$ is robust.

Motivation. This work is motivated by the desire to banish the nasty corner cases that so often
deface otherwise beautiful proofs. When a proof distinguishes between cases like $X < Y$ and
$X > Y$, one often has to be very careful about which case to assign the special situation $X = Y$ to,
and sometimes this event even has to be analyzed as a third case. One type of equality is especially
annoying: when there is more than one optimal solution to a combinatorial problem. A common
experience is that one wants to prove a certain property of the optimal solution, but at some point
one realizes that it can only be shown that there exists one optimal solution having the property.
Proofs could be so much more elegant in a world without equality!

Inspired by the observation that degenerate cases of the kind just described are null sets in the 
space of weight configurations, we seek to investigate to what extent the above continuity argument 
can be exploited in the area of combinatorial optimization. 

The main obstacle is that one cannot assume that $A(x)$, the value of the solution computed
by algorithm $A$, is continuous. Consider for example the problem PARTITION, where $n$ numbers
$x_1, \ldots, x_n$ have to be assigned to two sets $S_1, S_2$ such that

E (i) 

is minimized. Approximation algorithms for this NP-hard optimization problem usually have some
points of discontinuity. These points lie on the border where the algorithm switches its output from
one partition to another. In contrast, the value of the optimal solution OPT is continuous, because
at the points where OPT switches its output, say from $(S_1, S_2)$ to $(S_1', S_2')$, the cost of $(S_1, S_2)$ equals
the cost of $(S_1', S_2')$. However, this observation by itself does not allow us to disregard some null case
while proving that $A$ is optimal, because the continuity of $A$ must be known a priori.

Our contribution. We identify a sufficient precondition for robustness that is much weaker than
continuity. The starting point is the observation that the source of discontinuity in algorithms
is conditional branching, as there is no other way to compute non-continuous functions in common
machine models. Our point of view on algorithms is the decision tree model. In this model an
algorithm branches by comparing the value $v(w)$ of some continuous branching function $v : Q^n \to Q$
with 0, where $w$ is the weight vector of the problem instance. The algorithm takes one branch if
$v(w) < 0$ and the other if $v(w) > 0$; ties are broken in some well-defined manner.

It will become clear that in this model the points of discontinuity are exactly (a subset of) the
cases where there is a tie that has to be broken by the algorithm. We show that a generalization
of the symbolic perturbation technique proposed in the context of geometric algorithms [4] [7] (see
Section 7 for a discussion of related work) guarantees robustness.

We complement our study with a theorem which states that there exists a symbolic perturbation
policy for any algorithm. That theorem, although not constructive (it might be arbitrarily
complicated to find a policy), supports our belief that the concept of symbolic perturbation is widely
applicable.

We wish to emphasize that this work proposes the employment of symbolic perturbation in analyses
of algorithms. Our goal is to simplify analyses by recognizing the situation where algorithms
implicitly make use of symbolic perturbation. We believe that for a huge class of algorithms this
holds true and is rather simple to verify. Once one has shown that an algorithm employs symbolic
perturbation, one can choose an arbitrary null case and assume during the analysis that it never
occurs. In other words, we understand symbolic perturbation as a proof technique rather than an
algorithm design technique.

Paper organization. In the following section we provide the mathematical basics of our result.
The main theorem of that section is a generalized version of the fact pointed out in the first two
lines of this article. Then, in Section 3 the model for combinatorial problems and algorithms is
introduced. In Section 4 we formally define what a null case is and also introduce our notion of
symbolic perturbation. After that, we show that algorithms using symbolic perturbation are robust.
The practicability of this result is demonstrated by a complete example given in Section 5, and in
Section 6 we prove that there exists a symbolic perturbation policy for any algorithm. Section 7
summarizes the findings and discusses directions for future research. There we also compare our
approach to other areas where symbolic perturbation has been proposed.

2 Mathematical background 

In the introduction we have considered an algorithm that takes vectors from $\mathbb{R}^n$ as the input, and
a null case has been a null set in that space. However, we want to avoid that the results are only
shown for the rather abstract computational model where weights can be any real number. We
therefore assume that the numbers operated with belong to a subset $Q \subseteq \mathbb{R}$. Depending on the model
of computation, $Q$ can be equal to $\mathbb{R}$, $Q$ can be the set of rational numbers, or the set of algebraic
numbers. Although not necessary, it makes sense to assume that $Q$ is dense in itself, i.e. for any
$x \in Q$ and $\varepsilon > 0$ there is some $y \in Q$ with $0 < |x - y| < \varepsilon$. The assumption is sensible because
isolated points never belong to null cases.

The notion of null sets has been used in the introduction for illustrating the ideas, but in the 
remainder of the paper no measure theory is employed. It turns out that null cases can as well be
defined, in a simpler and even more general way, in terms of subsets of the metric space $(Q^n, d)$.
In principle, our results hold regardless of the metric d, but for simplicity we assume that d is the 
Euclidean distance. 

The following definitions can be found in any introductory analysis textbook. A neighborhood
of a point $x \in Q^n$ is defined as $\{y \in Q^n \mid d(x, y) < \varepsilon\}$ for some $\varepsilon > 0$. A subset of $Q^n$ is defined to be
open if it contains some neighborhood of each of its elements. A function $f : Q^n \supseteq X \to Q$ is said to
be continuous at $x \in X$ if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that $d(x, y) < \delta \Rightarrow |f(x) - f(y)| < \varepsilon$
for every $y \in X$. The function $f$ is continuous if it is continuous at every point of its domain. The
following definition corresponds to the notion of null sets used in the introduction.

Definition 1 A set $X \subseteq Q^n$ is nowhere open if it contains no non-empty open subset.

The correspondence is given by the fact that in the Euclidean space $\mathbb{R}^n$, with a measure defined in
the standard way, any null set is nowhere open. Definition 1 is even more general: e.g. the set of
irrational numbers is nowhere open in $\mathbb{R}$ although it is not a null set. An alternative definition would
be to say that a set is nowhere open if it has no interior points.

For the observation pointed out in the first two lines of the paper to hold, it is crucial that the
domain of $f$ satisfies a certain condition. For example, if the domain of $f$ were a null set itself, the
claim would not hold. We formulate the requirement as follows.

Definition 2 A set $X \subseteq Q^n$ is called semi-open if any neighborhood of its members contains a
non-empty open subset of $X$.

Note that each open set, including the empty set, is semi-open. Intuitively, a semi-open set is
the union of an open set with a subset of its boundary. We are now ready to translate the proposition
from the first two lines of this article into the terms just introduced.

Lemma 1 Given a non-empty semi-open set $X \subseteq Q^n$, a nowhere open set $C \subseteq X$, and a continuous
function $f : X \to Q$, it holds that $\sup\{f(x) \mid x \in X \setminus C\} = \sup\{f(x) \mid x \in X\}$.

Proof. Consider any point $x \in C$ and any $\varepsilon > 0$. From Definition 2 it follows that the $\varepsilon$-neighborhood
of $x$ contains a non-empty open subset $U$ of $X$. As $C$ is nowhere open, it cannot contain every point of $U$,
i.e. there is a point $x_\varepsilon \in X \setminus C$ with $d(x, x_\varepsilon) < \varepsilon$.

As a consequence, there is a sequence $(x_i)_{i \in \mathbb{N}}$ with $x_i \in X \setminus C$ for all $i \in \mathbb{N}$ and $\lim_{i \to \infty} x_i = x$. From
the continuity of $f$ it follows that $\lim_{i \to \infty} f(x_i) = f(x)$, which implies that $f(x) \le \sup\{f(y) \mid y \in X \setminus C\}$.

□

As outlined in Section 1, one cannot hope to achieve that the output of algorithms is continuous
in the input. We now specify a weaker property than continuity. The subsequent theorem ensures
that this property is sufficient for robustness, and in the subsequent sections it will be shown that
suitable tie-breaking strategies ensure that the output of algorithms satisfies it.

Definition 3 Let $X \subseteq Q^n$ be semi-open. A function $f : X \to Q$ is called locally continuous if for any
$x \in X$ there is a semi-open set $M \subseteq X$ with $x \in M$ such that $f|_M$ is continuous.

Theorem 1 Let $X \subseteq Q^n$ be semi-open, let $f : X \to Q$ be locally continuous, and let $C \subseteq X$ be a
nowhere open set. Then $\sup\{f(x) \mid x \in X \setminus C\} = \sup\{f(x) \mid x \in X\}$.

Proof. Each $x \in C$ belongs to a semi-open subset $M \subseteq X$ where $f|_M$ is continuous. From Lemma 1
it follows that $f(x) \le \sup\{f(y) \mid y \in M \setminus C\} \le \sup\{f(y) \mid y \in X \setminus C\}$. □
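
As a small illustration of Definition 3 and Theorem 1 (not part of the original argument), take $Q = \mathbb{R}$, $n = 1$ and the nowhere open set $C = \{0\}$. The two step functions below are assumptions chosen for illustration: $f$ resembles a comparison that sends ties to the right branch, whereas $g$ treats the tie as an isolated special case.

```latex
f(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0 \end{cases}
\qquad
g(x) = \begin{cases} 1 & x = 0 \\ 0 & x \neq 0 \end{cases}

% f is locally continuous: M = (-\infty, 0) for x < 0 and M = [0, \infty) for x \ge 0
% are semi-open, and f is constant on each of them.
\sup\{ f(x) \mid x \in \mathbb{R} \setminus C \} = 1 = \sup\{ f(x) \mid x \in \mathbb{R} \}

% g is not locally continuous at 0: every semi-open M containing 0 contains points
% y \neq 0 arbitrarily close to 0, where g(y) = 0 \neq g(0).
\sup\{ g(x) \mid x \in \mathbb{R} \setminus C \} = 0 < 1 = \sup\{ g(x) \mid x \in \mathbb{R} \}
```

So $f$ is discontinuous yet satisfies the conclusion of Theorem 1, while $g$ shows that the conclusion can fail once local continuity is dropped.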

3 Weighted problems and resistant algorithms 

In this work we intend to make provable and general statements about algorithms for weighted
combinatorial problems. Therefore, we need a precisely defined model that is general enough to capture
a large number of actual problems and algorithms. For simplicity we only consider minimization
problems here, but all our results can be shown for maximization problems in exactly the same way.
For an example of how an actual algorithm for an actual problem fits into the model we refer to
Section 5.

A weighted combinatorial optimization problem $P$ is a set of problem structures. Each problem
structure $S \in P$ is characterized by the number $n$ of weights, the semi-open domain $W \subseteq Q^n$ of
possible weight vectors, the set $L$ of feasible solutions, and the evaluation function $\mathrm{cost} : W \times L \to \mathbb{R}$.
For any fixed $l \in L$, $\mathrm{cost}(\cdot, l)$ is continuous. A problem instance of $P$ is a pair $(S, w)$ with
$S = (n, W, L, \mathrm{cost}) \in P$ and $w \in W$. The objective is to determine a solution $l \in L$ that minimizes $\mathrm{cost}(w, l)$.

A deterministic algorithm $A$ for some problem $P$ is formally defined as follows: For each structure
$S = (n, W, L, \mathrm{cost}) \in P$ there is a corresponding finite binary decision tree $T$ whose leaves are
associated with solutions from the set $L \cup \{l_f\}$, where $l_f \notin L$ is a special solution for modeling the
failure of $A$ to compute a feasible solution. Each internal node of $T$ is associated with a continuous
branching function $v : W \to Q$. Given a problem instance $(S, w)$ as the input, the algorithm traverses
the tree $T$ corresponding to $S$ starting at its root. At each encountered internal node $v$ it evaluates
$v(w)$. If $v(w) < 0$ the left branch is taken, and if $v(w) > 0$ the algorithm takes the right branch. In the
situation of $v(w) = 0$, called a tie, either the left or the right branch is taken. The decision is made
by some deterministic tie breaking policy. When the algorithm finally reaches a leaf, it returns the
solution $l \in L \cup \{l_f\}$ associated with it.
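
The decision tree model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Leaf` and `Node`, the function `run`, and the two-item instance are hypothetical and do not appear in the paper; the tie breaking policy is passed in as a callback so that different policies can be compared.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Weights = Sequence[float]


@dataclass
class Leaf:
    solution: str  # a feasible solution, or a marker for the failure solution


@dataclass
class Node:
    branch: Callable[[Weights], float]  # continuous branching function v
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def run(tree, w, tie_goes_left):
    """Traverse the tree for weight vector w.

    tie_goes_left(node, w) is the deterministic tie breaking policy;
    it is only consulted when the branching function evaluates to 0.
    """
    node = tree
    while isinstance(node, Node):
        value = node.branch(w)
        if value < 0 or (value == 0 and tie_goes_left(node, w)):
            node = node.left
        else:
            node = node.right
    return node.solution


# Hypothetical instance: choose the lighter of two items.
tree = Node(lambda w: w[0] - w[1], Leaf("item 0"), Leaf("item 1"))
always_left = lambda node, w: True
```

With this toy tree, `run(tree, [2, 2], always_left)` reaches the left leaf, while a policy that always answers `False` reaches the right leaf for the same weights; for all inputs without a tie the policy is irrelevant.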

The reader may have noticed that the definitions include two continuity assumptions. First, we
assume that the cost function is continuous for any fixed solution. We are convinced that this barely
restricts the applicability of our model. In optimization problems as we know them, the cost function
is typically very simple (see e.g. Equation 1 in Section 1). A straightforward consequence of the first
continuity assumption is that the value of the optimal solution is continuous as well.

Lemma 2 Let $P$ be a weighted combinatorial optimization problem, and let $\mathrm{OPT}(S, w)$ denote the
value of the optimal solution for instance $(S, w)$. For any fixed structure $S \in P$ it holds that $\mathrm{OPT}(S, \cdot)$
is continuous.

Proof. Let $S = (n, W, L, \mathrm{cost})$. By definition, $\mathrm{OPT}(S, w) = \min\{\mathrm{cost}(w, l) \mid l \in L\}$. As the
min-operator is continuous, OPT is calculated as a composition of continuous functions. □

The second continuity assumption concerns the branching functions. Again, we believe that the
behavior of most algorithms can be modeled using only continuous branching functions. One could
justify the assumption theoretically by arguing that in machine models the only source of
discontinuity is conditional branching.

We remark that the model only captures algorithms that always terminate. Some comments about
how infinite loops could be modeled can be found in Section 7.

4 Null cases and robust algorithms 

In this section we define null cases formally. We then describe the concept of symbolic perturbation
and show that this way of breaking ties leads to algorithms that can be analyzed under the assumption
that some arbitrary null case does not occur. As mentioned in the introduction, this is already the
main result of this paper, because we refrain from proposing ways to compute symbolic perturbation
explicitly.

Definition 4 Let $P$ be a weighted combinatorial optimization problem. A null case is a set $\mathcal{C}$ of
instances of $P$ where for any fixed structure $S = (n, W, L, \mathrm{cost}) \in P$ the set $\{w \in W \mid (S, w) \in \mathcal{C}\}$ is a
nowhere open set in $W$.

We need to introduce some more mathematical concepts. For $x \in Q$, $\mathrm{sgn}(x)$ is defined as $x/|x|$ if
$x \neq 0$ and $\mathrm{sgn}(0) = 0$.

Definition 5 Let $h : \mathbb{R} \to Q^n$ be a function such that $h(0) = 0$ and $h$ is continuous at 0. Let further
$W \subseteq Q^n$ be semi-open and $w \in W$.

We say that $W$ continues into direction $h$ at $w$ if there is some $\delta > 0$ such that for any $0 < a < \delta$
it holds that some neighborhood of $w + h(a)$ is contained in $W$.

If $W$ continues into direction $h$ at $w$ and $f : Q^n \to Q$ is continuous, we say that $f$ is increasing
(constant, decreasing) into direction $h$ at $w$ if there is some $\delta > 0$ such that for each $0 < a < \delta$ it
holds that $\mathrm{sgn}(f - f(w))$ is constantly 1 (0, $-1$) in some neighborhood of $w + h(a)$.

Note that the condition of $f$ being constant in some neighborhood of $w + h(a)$ is only required
explicitly in the case of $f$ being constant into direction $h$. In the other cases it suffices to demand that
$f(w + h(a)) - f(w) > 0$ ($< 0$) for any $0 < a < \delta$; the rest follows automatically from the continuity of $f$.

Definition 6 Let $A$ be an algorithm for problem $P$. A symbolic perturbation tie breaking policy for
$A$ is given as follows.

For each problem instance $(S, w)$ with $S = (n, W, L, \mathrm{cost})$ there is a function $h : \mathbb{R} \to Q^n$ such that
$W$ continues into direction $h$ at $w$ and any branching function in the tree corresponding to $S$ is either
decreasing, constant, or increasing into direction $h$ at $w$.

In case of a tie at node $v$, algorithm $A$ takes the left branch when $v$ is decreasing into direction $h$
at $w$. Otherwise, the right branch is taken.

For proving that A uses symbolic perturbation, one has to specify a suitable function h for each 
instance and then show that the algorithm breaks ties according to it. This might sound more involved 
than it actually is: in most cases, a small number of simple functions work out for all instances. 
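
For linear branching functions and a linear direction function, the rule of Definition 6 reduces to a sign test. The following Python sketch assumes $v(w) = \sum_i c_i w_i$ and $h(a) = a \cdot d$ for a fixed vector $d$, so that $v(w + h(a)) - v(w) = a \cdot \sum_i c_i d_i$; the function names and the choice of exact `Fraction` arithmetic are assumptions made for illustration, not part of the paper.

```python
from fractions import Fraction


def sgn(x):
    return (x > 0) - (x < 0)


def direction_sign(coeffs, d):
    """Sign of v(w + a*d) - v(w) for any a > 0, where v(w) = sum(c_i * w_i)."""
    return sgn(sum(Fraction(c) * Fraction(di) for c, di in zip(coeffs, d)))


def take_left(coeffs, w, d):
    """Branch decision at a node with linear branching function v."""
    value = sum(Fraction(c) * Fraction(wi) for c, wi in zip(coeffs, w))
    if value != 0:
        return value < 0  # no tie: ordinary comparison with 0
    # Tie: take the left branch exactly when v decreases into direction d.
    return direction_sign(coeffs, d) < 0
```

A direction with `direction_sign(coeffs, d) == 0` is not admissible for a non-constant linear $v$: such a $v$ is not constant in any neighborhood of $w + h(a)$, so it is neither increasing, constant, nor decreasing into that direction in the sense of Definition 5.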

The following lemma can be considered as the main technical lemma of this section, because it 
establishes the link between symbolic perturbation and the property of robustness. 

Lemma 3 Let $A$ be an algorithm for problem $P$. If $A$ employs symbolic perturbation, then any node
$u$ in a decision tree traversed by $A$ for some structure $S = (n, W, L, \mathrm{cost}) \in P$ has the property that it
is encountered by $A$ for a semi-open subset of $W$.

Proof. Let $X \subseteq W$ be the set of weight vectors where node $u$ is traversed by $A$, and let $w \in X$. Let
$h$ be the direction function corresponding to instance $(S, w)$ according to Definition 6. We show that
$X$ continues into direction $h$ at $w$, which directly implies the lemma.

We use induction over the tree depth of $u$ and therefore consider $u$ to be the tree root in the base
case. Here, $X = W$ and the claim holds by definition. For the induction step, assume that the claim
has already been shown for some node $v$, and consider a child node $u$ of $v$. We only consider the
case where $u$ is the right child of $v$, because the argumentation for the opposite case is analogous.
Let $X$ and $X'$ be the sets of weight vectors where $v$ and $u$ are reached by $A$, respectively. Now consider
any vector $w \in X'$. We distinguish between three different cases that can occur.

Case 1: v(w) > 0. Then, due to the continuity of v, there is some neighborhood of w that is 
completely contained in X'. The claim follows from the induction hypothesis. 

Case 2: $v(w) = 0$ and $v$ is constant into direction $h$ at $w$.$^1$ By definition, there is a $\delta > 0$ such
that for each $0 < a < \delta$ it holds that $v(U_a) = 0$ for some neighborhood $U_a$ of $w + h(a)$. As these
neighborhoods are open sets, each point $y$ from any $U_a$ has a neighborhood where $v$ is constantly 0,
and so $v$ must be constant into any direction at $y$. Consequently, the algorithm takes the branch to $u$
for weight configuration $y$ as well. In other words, the neighborhoods $U_a$ are completely contained
in $X'$, which proves that $X'$ continues into direction $h$ at $w$.

$^1$ Note that this case does not even occur in the setting where $u$ is the left child of $v$.

Case 3: $v(w) = 0$ and $v$ is increasing into direction $h$ at $w$. By definition, this means that there
is some $\delta > 0$ such that for any $0 < a < \delta$ there is some $\varepsilon_a$ with $v(x) > 0$ for any $x \in W$ with
$d(w + h(a), x) < \varepsilon_a$. The induction hypothesis states that there is some $\delta'$ such that for any $0 < a < \delta'$
there is some $\varepsilon_a'$ with $x \in X$ for any $x \in W$ with $d(w + h(a), x) < \varepsilon_a'$. These two propositions imply
that there is some $\delta'' := \min\{\delta, \delta'\}$ such that for any $0 < a < \delta''$ there is some $\varepsilon_a'' := \min\{\varepsilon_a, \varepsilon_a'\}$
with $x \in X'$ for any $x \in W$ with $d(x, w + h(a)) < \varepsilon_a''$. In other words, $X'$ continues into direction $h$
at $w$. □

With one more simple lemma, the main theorem follows rather straightforwardly. 

Lemma 4 Let $f : Q^n \to \mathbb{R}$ be locally continuous and let $g : \mathbb{R} \to \mathbb{R}$ be continuous. Then $g \circ f$ is
locally continuous.

Proof. Each point $x$ in the domain of $f$ is contained in some semi-open subset $X$ such that $f|_X$ is
continuous. As $g$ is continuous, so is $(g \circ f)|_X$. □

Theorem 2 Let $P$ be a weighted combinatorial optimization problem and let $A$ be a deterministic
algorithm for it using symbolic perturbation. Then $A$ is robust, i.e. the following properties hold for
any null case $\mathcal{C}$.

a) If $A$ returns a solution from $L$ for all problem instances except those in $\mathcal{C}$, then this also holds
for the instances in $\mathcal{C}$.

b) If $A$'s runtime (memory consumption) only depends on the taken tree path and is bounded by
some function $t : P \to \mathbb{N}$ for all problem instances except those in $\mathcal{C}$, then this bound also holds
for the instances in $\mathcal{C}$.

c) If the cost of the solution computed by $A$ is bounded, for all problem instances except those in $\mathcal{C}$,
by a function $b : P \times Q^{\mathbb{N}} \to \mathbb{R}$ that is continuous in the second parameter, then this bound also
holds for the instances from $\mathcal{C}$.

From part (c) of the theorem one can derive further properties of algorithms using symbolic
perturbation. These properties are established by Lemma 2.

Corollary 1 Let $P$ be a weighted combinatorial optimization problem and let $A$ be an algorithm using
symbolic perturbation. Then, for any null case $\mathcal{C}$,

d) if $A$ is optimal for all inputs except those in $\mathcal{C}$, then $A$ is also optimal for inputs from $\mathcal{C}$.

e) if $A$ is a $c$-approximation for all inputs except those in $\mathcal{C}$, where $c : P \times Q^{\mathbb{N}} \to \mathbb{R}$ is a function
that is continuous in the second parameter, then $A$ is also a $c$-approximation for inputs from $\mathcal{C}$.

Proof of Theorem 2. Let $\mathcal{C}$ be an arbitrary null case. For proving parts a) and b), imagine that $A$
traverses the tree path to $l$ when processing an instance $(S, w) \in \mathcal{C}$. Due to Lemma 3, the same tree
path is traversed for all weights from some semi-open set $X$ that contains $w$. As $X \neq \emptyset$, it follows that
$X$ has some non-empty open subset, so not every element of $X$ can be a member of $\{x \mid (S, x) \in \mathcal{C}\}$.
Therefore, there is an instance $(S, x) \notin \mathcal{C}$ where $A$ behaves like it does in instance $(S, w)$ and thus
returns the same solution and has the same resource requirements.
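It remains to prove proposition c). We have that

$\sup\{A(S, w)/b(S, w) \mid w \in W, (S, w) \notin \mathcal{C}\} \le 1$.

Lemma 3 implies that $A(S, \cdot)$ is locally continuous. As $b(S, \cdot)$ is continuous, we can apply Lemma 4
two times for showing that $A(S, w)/b(S, w)$ is locally continuous in $w$. Theorem 1 gives

$\sup\{A(S, w)/b(S, w) \mid w \in W\} = \sup\{A(S, w)/b(S, w) \mid w \in W, (S, w) \notin \mathcal{C}\} \le 1$.

□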



5 An example 

This section demonstrates how the concept of symbolic perturbation helps to simplify the
analysis of algorithms. As our example we consider the classical Hu-Tucker Algorithm for the
Alphabetic Tree Problem. It is one of the classic examples of a simple problem admitting a simple
algorithm for solving it, whereas the correctness proof for the algorithm is highly complicated.

In the Alphabetic Tree Problem we are given a sequence of $n$ letters $b_1, \ldots, b_n$ having certain
weights $w_1, \ldots, w_n \in \mathbb{Q}_0^+$. The objective is to find a binary tree $D$ whose leaves in left-to-right order
are exactly $b_1, \ldots, b_n$ minimizing

$\sum_{i=1}^{n} w_i \cdot \mathrm{depth}(b_i)$, (2)

where $\mathrm{depth}(b_i)$ is the distance between $b_i$ and the root of $D$.

We first show how the Alphabetic Tree Problem fits into the model of combinatorial optimization
problems introduced in Section 3. Let $P$ denote the Alphabetic Tree Problem. Then $P$ contains a
problem structure $S_n = (n, (\mathbb{Q}_0^+)^n, L_n, \mathrm{cost}_n)$ for each natural number $n$. In other words, the problem
structure is determined by the number of letters, and the domain of the weights is such that each
individual weight can be any nonnegative rational number. The solution set $L_n$ contains all binary
trees having $n$ leaves, and the cost function assigns the weighted tree depth as defined in Equation 2.
It is straightforward to see that the domain of weights is semi-open and that the cost functions are
continuous for any fixed solution, so the Alphabetic Tree Problem fits well into our definition of
combinatorial optimization problems.

There is a straightforward dynamic programming method solving the problem in time $O(n^3)$. Hu
and Tucker [2] were the first to derive an $O(n \log n)$ time optimal algorithm. Their method, denoted
as H in the following, resembles the well-known Huffman coding scheme 0. 
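
The paper does not spell out the cubic dynamic program, so the following Python sketch is an assumption: it uses the standard interval recurrence in which the best tree over leaves $i..j$ splits at some $k$ and every leaf in the interval moves one level deeper. The function name `optimal_alphabetic_cost` is hypothetical.

```python
from itertools import accumulate


def optimal_alphabetic_cost(weights):
    """Minimum of sum(w_i * depth(b_i)) over all alphabetic binary trees."""
    n = len(weights)
    prefix = [0] + list(accumulate(weights))
    # best[i][j]: minimum weighted depth of a tree over leaves i..j (inclusive)
    best = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Joining two subtrees under a new root deepens every leaf by one,
            # which adds the total weight of the interval to the cost.
            total = prefix[j + 1] - prefix[i]
            best[i][j] = total + min(
                best[i][k] + best[k + 1][j] for k in range(i, j)
            )
    return best[0][n - 1] if n else 0
```

Such a routine is mainly useful as a reference value against which the output of a faster method can be compared on small instances.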

In the first phase, H maintains a sequence of tree nodes, both internal nodes and leaves. Two 
nodes are combinable if there are only internal nodes between them in the sequence. In each step, 
H combines the combinable node pair having minimum total weight. Ties are resolved by choosing
among the candidate pairs the one consisting of the leftmost possible nodes. This tie-breaking policy
is commonly called alphabetic ordering. Combining two nodes means to make them the left and 
right children of a new internal node. When two nodes are combined, they are both removed from 
the working sequence, the new internal node takes the former position of its left child, and the weight 
of the new node is the sum of its children's weights. The first phase ends as soon as there is only one 
node left. 
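
The first phase as described above can be sketched naively in Python. This is an illustration only: it rescans all combinable pairs in every step instead of using the data structures needed for the fast running time, and the function name `hu_tucker_depths` and the list-based node representation are assumptions. Ties are resolved by scanning pairs from left to right and accepting only strict improvements.

```python
def hu_tucker_depths(weights):
    """Leaf depths produced by the first phase with leftmost-pair tie-breaking."""
    # Each working node: [weight, is_leaf, indices of the leaves below it]
    seq = [[w, True, [i]] for i, w in enumerate(weights)]
    depth = [0] * len(weights)
    while len(seq) > 1:
        best = None
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                total = seq[i][0] + seq[j][0]
                # Strict '<' keeps the leftmost pair on ties, because pairs
                # are scanned by left position first, then by right position.
                if best is None or total < best[0]:
                    best = (total, i, j)
                if seq[j][1]:
                    break  # a leaf blocks every pair reaching further right
        total, i, j = best
        leaves = seq[i][2] + seq[j][2]
        for leaf in leaves:
            depth[leaf] += 1  # both subtrees move one level down
        seq[i] = [total, False, leaves]  # new node takes the left child's place
        del seq[j]
    return depth
```

For five equal weights this sketch yields the depths 3, 3, 2, 2, 2, which an alphabetic tree can realize.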

The first phase might produce a tree, say $D'$, that does not satisfy the ordering restriction; see the
instance below. In the second phase, $H$ constructs a tree $D$ where the depth of any leaf is the same as
in $D'$. The second phase is very simple to implement; we do not go into details here.

One of the crucial points is to prove that the second phase always succeeds, i.e. the first phase
always produces a tree $D'$ that can be turned into an alphabetic tree. Consider for example the five
leaves $b_1, b_2, b_3, b_4, b_5$. With a different tie-breaking policy, the first phase could combine the pairs
$(b_2, b_3)$ and $(b_1, b_4)$ into two internal nodes $b_{23}$ and $b_{14}$. Then it could combine $(b_{14}, b_5)$ into
$b_{145}$, and finally $(b_{145}, b_{23})$. There is certainly no alphabetic tree where $b_1, b_2, b_3, b_4, b_5$ have
depth 3, 2, 2, 3, 2, respectively. This example demonstrates that tie-breaking matters, because with
the wrong tie-breaking policy the tree $D'$ could have been constructed from an instance where
$w_1 = w_2 = w_3 = w_4 = w_5$.

The second crucial point to prove is that the tree constructed by $H$ is optimal. To this day, no
simple proof for the correctness of the Hu-Tucker Algorithm is known. The concept of symbolic
perturbation still does not turn the proof into a truly simple one, but at least the consideration of ties,
which plays an important role in the proof given by Hu and Tucker, can be set aside.

This is how Algorithm $H$ fits into the model of deterministic algorithms introduced in Section 3:
the decision which two nodes to combine in the first phase can be made by repeatedly comparing
sums of node weights. This is equivalent to comparing the difference of node weight sums with
0. For example, when $H$ wants to decide which of the pairs $(b_1, b_2)$, $(b_2, b_3)$ has the smaller total
weight, it would compare $(w_2 + w_3) - (w_1 + w_2)$ with 0. We assume that the subtrahend is always the
node pair having the leftmost left component; if the left components of the pairs are identical, the
pair having the leftmost right component is chosen as the subtrahend. Note that this viewpoint is still
valid if the comparisons are made inside some sophisticated data structure (as required for achieving
the $O(n \log n)$ runtime). Each branching function is a linear combination

$v(w) = \sum_{i=1}^{n} c_i w_i$ with $c_i \in \{-1, 0, 1\}$ for $i = 1, \ldots, n$ and $(c_1, \ldots, c_n) \neq (0, \ldots, 0)$ (3)

of weights. Therefore, the branching functions are clearly continuous. It can further be assumed that
after Phase 1 the leaves of the decision tree are reached, because the computations in Phase 2 are
completely determined by the tree $D'$ computed in Phase 1 and do not further depend on the weights.

Now we want to show that the tie-breaking of $H$ can be described as symbolic perturbation. Here
a single direction function $h(a) := a \cdot (2^n, 2^{n-1}, \ldots, 2^1)$ works for each possible configuration of
$n$ leaf weights. It is straightforward to observe that $(\mathbb{Q}_0^+)^n$ continues into direction $h$ at each point.
It is also not hard to see that the branching functions of $H$ are always increasing or decreasing into
direction $h$, depending on the algebraic sign of the first (leftmost) coefficient with $c_i \neq 0$.

As each internal node takes the position of its left child, the nodes in the sequence maintained
by $H$ are always ordered by the position of their leftmost descendant leaf in the initial sequence.
Together with the way the subtrahend is chosen among the two pairs when comparing them, it follows
that the first nonzero coefficient in Equation 3 is always $-1$. In other words, $v$ always decreases into
direction $h$ at $w$, and therefore $H$ behaves according to $h$ when choosing the leftmost pair.
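
The claim that the sign of the leftmost nonzero coefficient decides the behavior along $h$ can be checked by brute force for small $n$. The Python sketch below assumes the linear form of Equation 3 and the direction function with components $2^n, \ldots, 2^1$; the function names are hypothetical.

```python
from itertools import product


def sgn(x):
    return (x > 0) - (x < 0)


def slope_sign(coeffs):
    """Sign of v(w + h(a)) - v(w) for a > 0, with h(a) = a * (2^n, ..., 2^1)."""
    n = len(coeffs)
    return sgn(sum(c * 2 ** (n - i) for i, c in enumerate(coeffs)))


def first_nonzero_sign(coeffs):
    return next(sgn(c) for c in coeffs if c != 0)


def check(n):
    """Compare both signs for every nonzero coefficient vector in {-1, 0, 1}^n."""
    return all(
        slope_sign(c) == first_nonzero_sign(c)
        for c in product((-1, 0, 1), repeat=n)
        if any(c)
    )
```

The check succeeds because each power of two exceeds the sum of all smaller powers used, so the leftmost nonzero term dominates the rest.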

It follows that $H$ employs symbolic perturbation and therefore can be analyzed under the
assumption that some arbitrary null case never occurs. For the analysis of $H$, and also for the analyses
of many other algorithms, it is most convenient to assume that all weights are non-zero, ties
never occur during the execution, and there is exactly one optimal solution. The excluded scenarios can be
described as the set of roots of a finite family of non-constant linear functions, and it is not hard to
observe that such sets are nowhere open.

6 Proof of existence 

In this section we shall show that the technique of symbolic perturbation is not limited to a certain 
subclass of algorithms, but can in principle be used in order to make any algorithm robust. We remark 
that the result is not constructive: only the existence of a symbolic perturbation tie breaking policy is 
proven, not the computability. 

Theorem 3 There is a symbolic perturbation tie breaking policy for any deterministic algorithm for 
a combinatorial optimization problem. 

The central argument of the proof is provided by the following lemma. 

Lemma 5 Let $X \subseteq Q^n$ be a non-empty open set and let $f : X \to Q$ be a continuous function. Then
there exists a non-empty open subset $Y \subseteq X$ with $\mathrm{sgn}(f(Y))$ being constantly 0, $-1$, or 1.

Proof. If $f(X)$ is constantly 0, we are done. Otherwise, there is some $x \in X$ with $f(x) \neq 0$, say
w.l.o.g. $f(x) > 0$. The continuity of $f$ implies that each element from some $\varepsilon$-neighborhood of $x$ has
a positive image. Furthermore, $X$ contains some $\varepsilon'$-neighborhood of $x$ because it is open. $Y$ being
the $\min(\varepsilon, \varepsilon')$-neighborhood of $x$ satisfies the desired property. □

Proof of Theorem 3. Let $A$ be an algorithm for problem $P$. For each instance $(S, w)$, $S = (n, W, L, \mathrm{cost}) \in P$,
we have to prove the existence of a direction function $h$ satisfying the demands from Definition 6.
Let $(v_1, \ldots, v_m)$ be an enumeration of all branching functions in the tree traversed by $A$ for input
$(S, w)$.

Consider an infinite sequence of non-empty open sets $(U_i)_{i \in \mathbb{N}}$ such that $U_i \subseteq W$ for each $i \in \mathbb{N}$ and for any
$\varepsilon > 0$ there is some $i_0 \in \mathbb{N}$ such that $U_i$ is completely contained in the $\varepsilon$-neighborhood of $w$ for each
$i > i_0$. Such a sequence exists because $W$ is semi-open.

In the following we modify that sequence subject to the goal that for each $v_j$, $1 \le j \le m$, it holds
that $\mathrm{sgn}(v_j(\bigcup_{i=1}^{\infty} U_i))$ is constantly 0, 1, or $-1$ and still each member of the sequence is open.

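Assume that this goal has already been achieved for $v_1, \ldots, v_{j-1}$. Any sequence member $U_i$ is
open, so Lemma 5 implies that for any $i \in \mathbb{N}$ there is a non-empty open subset $U_i' \subseteq U_i$ such that $\mathrm{sgn}(v_j(U_i'))$ is
constantly $-1$, 0, or 1. Since only three sign values are possible, at least one of them occurs infinitely
often. Therefore, there has to be a $k \in \{-1, 0, 1\}$ with the property that for each $i \in \mathbb{N}$
there is an $i' > i$ with $\mathrm{sgn}(v_j(U_{i'}')) = k$.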

We now modify the sequence $(U_i)_{i \in \mathbb{N}}$ in two steps. First, the sequence is replaced with $(U_i')_{i \in \mathbb{N}}$.
Then, each component $U_i'$ with $\mathrm{sgn}(v_j(U_i')) \neq k$ is removed. Both steps preserve the desired property
for the functions $v_1, \ldots, v_{j-1}$, and the property is established for $v_j$.

Having constructed the sequence of open sets, we define the direction function $h$ in a pragmatic
way: For any $a > 0$ we know that there is some member $U_i$ of our sequence that is completely
contained in the $a$-neighborhood of $w$. We choose an arbitrary interior point $x_a$ from $U_i$ and assign
$h(-a) = h(a) = x_a - w$. Doing this for any value of $a$ and assigning $h(0) = 0$ defines $h$.
The function $h$ is continuous at 0, because $d(h(\varepsilon), h(0)) < \varepsilon$ always holds. By this construction,
$W$ continues into direction $h$ at $w$, and each branching function $v_j$ is either decreasing, constant, or
increasing into direction $h$ at $w$. □
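
The last step of the construction can be sketched in code for a one-dimensional toy case. Everything concrete below is an assumption made for illustration: the weight $w = 0$, the sequence members $U_i = (1/(i+1), 1/i)$, the choice of the interval midpoint as interior point, and the helper names `make_direction_function` and `member`. The general proof is not constructive, so this only shows how $h$ is read off once a suitable sequence is given.

```python
from fractions import Fraction


def make_direction_function(w, member):
    """Build h from a sequence of open intervals shrinking towards w.

    member(i) is a hypothetical helper returning the endpoints (lo, hi)
    of the i-th open interval U_i, for i = 1, 2, 3, ...
    """
    def h(a):
        a = abs(Fraction(a))          # h(-a) = h(a) by definition
        if a == 0:
            return Fraction(0)        # h(0) = 0
        i = 1
        while True:
            lo, hi = member(i)
            # U_i lies inside the a-neighborhood (w - a, w + a) of w
            if w - a <= lo and hi <= w + a:
                x_a = (lo + hi) / 2   # an interior point of U_i
                return x_a - w
            i += 1
    return h


# Toy sequence: U_i = (1/(i+1), 1/i), shrinking towards w = 0.
w = Fraction(0)
member = lambda i: (w + Fraction(1, i + 1), w + Fraction(1, i))
h = make_direction_function(w, member)

print(h(Fraction(1, 2)))   # interior point of U_2 = (1/3, 1/2), minus w
```

In this toy case $|h(a)| < a$ holds for every $a > 0$, which is the inequality used above to conclude that $h$ is continuous at 0.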



7 Summary, discussion of related and future work 

In this work we have proposed symbolic perturbation as a proof technique for optimization algorithms.
We believe that in many involved analyses it will pay off to first show symbolic perturbation
and then assume that some arbitrary null case does not occur. The concept allows theoreticians to
concentrate on the interesting points of proofs without spending too much effort on the boring corner
cases. Typical examples of null cases are:

• there are ties that the algorithm has to break.

• some weights are zero. 

• there is more than one optimal solution. 

Here is the cooking recipe for verifying that a case $\mathcal{C}$ can be excluded from the analysis of
algorithm $A$ for problem $P$.

1. Verify that $P$ fits into the model given in Section 3. In particular, verify that the cost functions
are continuous in the weights and that the weight domains are semi-open.

2. Verify that $A$ fits into the model given in Section 3. In particular, verify that the branching
functions are continuous.

3. Find direction functions $h$ with which the requirements in Definition 6 are satisfied. When
the algorithm employs alphabetic ordering to break ties like algorithm H in Section 5 does,
functions of the type $h(a) = a \cdot (2^n, 2^{n-1}, \ldots, 2^1)$ are often a good choice.

4. Verify that $\mathcal{C}$ is a null case. It suffices to show that any instance in $\mathcal{C}$ can be turned into an
instance not in $\mathcal{C}$ by shifting the weight vector by some infinitesimally small amount. The fact
that the set of roots of any finite family of non-constant polynomials is nowhere open might
also be helpful.
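
For step 3, the effect of a direction function of the type $h(a) = a \cdot (2^n, 2^{n-1}, \ldots, 2^1)$ can be sketched for the special case of linear branching functions. The restriction to linear functions and the helper name `perturbed_sign` are assumptions made for this illustration. For a linear $v(x) = c \cdot x + d$ we have $v(w + h(a)) = v(w) + a \cdot (c \cdot (2^n, \ldots, 2^1))$, so the sign for all sufficiently small $a > 0$ can be computed exactly.

```python
from fractions import Fraction


def perturbed_sign(coeffs, const, w):
    """Sign of v(w + h(a)) for all sufficiently small a > 0, where
    v(x) = coeffs . x + const is linear and h(a) = a * (2^n, ..., 2^1).

    Hypothetical helper for illustration; returns -1, 0, or 1.
    """
    n = len(w)
    value = sum(Fraction(c) * Fraction(x) for c, x in zip(coeffs, w)) + Fraction(const)
    if value != 0:
        # No tie: by continuity the sign is unchanged for small a.
        return 1 if value > 0 else -1
    # Tie: the sign is decided by the slope of v along the direction.
    direction = [2 ** (n - i) for i in range(n)]   # (2^n, ..., 2^1)
    slope = sum(Fraction(c) * d for c, d in zip(coeffs, direction))
    return (slope > 0) - (slope < 0)


# Comparing two equal weights: v(x) = x_1 - x_2 at w = (5, 5).
print(perturbed_sign([1, -1], 0, [5, 5]))
```

With two equal weights the tie is resolved in favor of the lower index, which matches the behavior of an index-based tie breaking rule. A result of 0 means that the branching function is constant into the chosen direction.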

Generalizations. We remark that the results from this work cannot be used in general to exclude 
null cases from the proof that an algorithm terminates for all instances. An algorithm not terminating 
could be modeled by an infinite path in the tree $T$. For example, the depth $m$ node on that path could
ask whether $w - 1/m > 0$, where $w \in \mathbb{Q}_{\geq 0}$ is some weight. If yes, the algorithm would leave the
infinite path. Consequently, it terminates for all instances except the null case $w = 0$, no matter how
ties are broken.
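
The example can be simulated with a bounded number of steps. The function name `exit_depth` and the cap `max_depth` are assumptions of this sketch, since an actual infinite path cannot be followed to its end.

```python
from fractions import Fraction


def exit_depth(w, max_depth=10_000):
    """Depth m at which the path is left, i.e. the first m with w - 1/m > 0.

    Returns None if no such depth is found up to max_depth (hypothetical cap).
    """
    w = Fraction(w)
    for m in range(1, max_depth + 1):
        if w - Fraction(1, m) > 0:
            return m
    return None


print(exit_depth(Fraction(1, 3)))   # leaves the path at depth 4
print(exit_depth(0))                # never leaves the path within the cap
```

For every $w > 0$ the path is left at the first depth $m > 1/w$. For $w = 0$ the test fails at every depth, and since no branching function ever evaluates to zero there, no tie breaking rule can change this.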

Another way to model infinite loops is to generalize the decision trees traversed by an algorithm 
to finite directed graphs. The theorems in this work generalize to that model, and here also the 
termination of robust algorithms can be proved disregarding some arbitrary null case. However, 
algorithms like the one described in the preceding paragraph cannot be modeled using only finitely 
many branching functions. 
Discussion of related results. The technique of symbolic perturbation has been proposed in a number
of different contexts. One application is the treatment of degenerate cases in implementations of
the Simplex algorithm for solving linear programs. When two or more vertices of the polyhedron
defined by the linear constraints coincide, it might happen that (a naive implementation of) Simplex
runs into an infinite loop. Explicit symbolic perturbation is one of the methods that have been
proposed in order to overcome that problem [6].

The conceptual difference to our results is that we give a method for recognizing implicit symbolic
perturbation rather than proposing a way to explicitly compute such a tie breaking policy. It
is worth mentioning, though, that Simplex can be described using only finitely many branching functions
for any problem instance, and therefore any symbolic perturbation method for it guarantees
termination.

Another application area of symbolic perturbation is geometric algorithms. Here one distinguishes
between problem-dependent and algorithm-dependent degeneracies Q|5|. A problem-dependent
degeneracy can be described as one in the input, e.g. three points located on the same
straight line in the convex hull problem. One could also say that problem-dependent degeneracies
are points of discontinuity of the function mapping the input space to the solution space.
Algorithm-dependent degeneracies occur when some branching function of the algorithm evaluates to zero.
Among others, Yap [7] has proposed a method to break ties by symbolic perturbation that can be
plugged into any algorithm whose branching functions are polynomials. The method was later
extended to rational functions by Neuhauser [4]. However, in a number of other articles it is argued
that perturbation is not the silver bullet for all kinds of problems with geometric degeneracies (HE).
First, it incurs extra resource requirements, and for most problems there are cheaper ways to
deal with degenerate cases. Second, problem-dependent degeneracies cannot be handled in
an appropriate way.
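
As a minimal illustration of a branching function that evaluates to zero, consider the orientation test for three points in the plane. That a convex hull algorithm branches on this particular predicate is an assumption of this sketch, as is the function name `orientation`. The text above only mentions collinear points as a degenerate input.

```python
def orientation(p, q, r):
    """Sign of the determinant deciding on which side of line pq the point r lies.

    Returns 1 (left turn), -1 (right turn), or 0 (collinear, a degenerate case).
    """
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (det > 0) - (det < 0)


print(orientation((0, 0), (1, 0), (0, 1)))   # left turn
print(orientation((0, 0), (1, 1), (2, 2)))   # collinear: the predicate is zero
```

The determinant is a polynomial in the input coordinates, so a predicate of this kind falls into the class of branching functions for which the polynomial perturbation methods mentioned above are designed.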

In the area of optimization problems, Lemma 2 ensures that problem-dependent degeneracies do
not exist. Therefore, optimization algorithms with certain types of branching functions can be made 
robust using the methods from QUI. It is of course questionable if the runtime of an algorithm 
should be increased just for the sake of a simpler analysis, but it might still be an option if one is 
satisfied with an algorithm having any polynomially bounded runtime. In our opinion, however, 
explicit perturbation in this context unnecessarily moves theory further away from practice, and one 
should preferably show implicit symbolic perturbation. 

Directions for future work. The research area of robust algorithms yields a number of interesting 
open problems. Firstly, it is currently not known if results similar to ours can be shown for models 
of inexact computation. A related question addresses combinatorial problems where the weights can 
have only integer values. Secondly, this work has addressed only deterministic algorithms, so can 
the results be generalized to nondeterminism? Finally, there might be situations where an explicit 
symbolic perturbation method can be plugged into algorithms without increasing the runtime, which 
makes it possible to disregard null cases even without having proved symbolic perturbation before. 
Identifying these situations is another issue for future research. 

Acknowledgement. The author wishes to thank Takeaki Uno for a number of helpful comments 
that have been the inspiration for a substantial improvement of the results. 